ABSTRACT 


This study identifies factors affecting the performance of commercial-off-the-shelf
speech recognition software (SRS) when used for ship control purposes. After a
review of research on the feasibility and acceptability of SRS-based ship control, the
paper examines the effects of:

•  A  restricted  vocabulary  versus  a  large  vocabulary, 

•  Low  experience  level  conning  officers  versus  high  experience  level 
conning  officers, 

•  Male  versus  female  voices, 

•  Pre-test  training  on  specific  words  versus  no  pre-test  training. 

Controlled  experimentation  finds  that: 

•  The  experience  level  of  a  conning  officer  has  no  significant  impact  on 
SRS  performance. 

•  Female  participants  experienced  more  SRS  errors  than  did  their  male 
counterparts.  However,  in  this  experiment,  only  a  limited  number  of  trials 
were  available  to  assess  a  difference. 

•  SRS  with  restricted  vocabulary  performs  no  better  than  SRS  with  large 
vocabularies. 

•  Using the software’s “correct as you go” feature may impact software
performance. Following user profile establishment, individual user
training on two specific words reduces error rates significantly.

This  study  concludes  that  SRS  is  a  viable  technology  for  ship  control  and  merits  further 
testing and evaluation.


I. BACKGROUND


A.  INTRODUCTION 

In  recent  years,  the  U.S.  Navy  has  begun  to  search  for  ways  to  decrease  the 
number  of  personnel  necessary  to  operate  a  ship  at  sea.  Initiatives  such  as  “Smart  Ship” 
design  and  the  ongoing  “Optimal  Manning  trials”  are  designed  to  show  that  ships  can 
operate at sea with reduced manning. [Ref. 1] Ship designers actively seek out
manpower-saving, technology-based options. Technological advances have not only
reduced  manning,  but  in  many  cases  enhanced  the  ability  of  watchstanders  to  conduct 
their  duties.  For  example,  engineering  watchstanders  use  handheld  computers  to  record 
engineering  plant  data  that  is  then  downloaded  to  a  computer  which  automatically 
generates  required  reports.  Combat  watchstanders  employ  touch-screen  technology  and 
automated  display  screens  to  speed  the  process  of  data  entry  and  display.  More  efficient 
and  accurate  computerized  navigation  systems  enable  quartermasters  to  plan  and  plot 
ship  movements.  [Ref.  2] 

Ship control, however, remains an area that seems relatively untouched by
technological advances, as traditions developed long before the birth of the U.S. Navy
still remain in place. This thesis explores one technology alternative, building upon
previous research by investigating the viability of using speech recognition software
(SRS) aboard naval vessels for ship control purposes and analyzing the system’s
potential to eliminate the need for two bridge watchstanders, the helmsman and the lee
helmsman.

Chapter  I  establishes  a  foundation  of  knowledge  by  examining  the  background 
and  historical  information  related  to  speech  recognition  technology.  Chapter  II  describes 
how  SRS  is  applicable  to  naval  vessels  and  considers  potential  barriers  to  its 
employment.  Chapter  III  delineates  the  methodology  utilized  in  an  experiment  designed 
to  show  sources  of  performance  variation  and  potential  avenues  to  reduce  SRS  error 
rates. Experimental results and their analysis are presented in Chapter IV; and finally,
Chapter V summarizes findings, makes recommendations and proposes areas of future
research.


B. SPEECH RECOGNITION TECHNOLOGY

Speech  Recognition  Software  (SRS),  also  called  Voice  Recognition  Software 
(VRS),  enables  a  computer  to  convert  a  spoken  word  (an  acoustic  signal)  into  text  which 
is  represented  within  the  computer  by  binary  digits.  At  the  heart  of  the  software  is  an 
analog-to-digital converter which digitizes the incoming analog signal and divides it into
10  to  20  millisecond  frames.  [Ref.  3]  These  frames  are  then  compared  to  a  digital  library 
stored  in  memory. 
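
The framing step can be illustrated with a minimal Python sketch. It assumes the signal has already been digitized into a list of samples. The function name, the 16 kHz sample rate used in the usage note, and the choice of non-overlapping frames are illustrative assumptions; they are not taken from Ref. 3 or from any particular SRS product.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=10):
    """Split digitized samples into consecutive, non-overlapping frames.

    frame_ms is the frame duration in milliseconds (the text cites 10 to 20 ms).
    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(sample_rate_hz * frame_ms / 1000)  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame duration too short for this sample rate")
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]
```

For one second of audio sampled at an assumed 16,000 Hz, a 10 ms frame holds 160 samples and yields 100 frames, while a 20 ms frame holds 320 samples and yields 50 frames.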

Speech  recognition  systems  focus  on  words  and  the  sounds  that 
distinguish  one  word  from  another  in  a  language.  Those  sounds  are  called 
phonemes. The words “seat,” “beat,” and “cheat” are different words
because,  in  each  case,  the  initial  sound  is  recognized  as  a  separate 
phoneme  in  English.  [Ref.  4] 

The  lexicon  library  contains  phoneme  models  which  define  the  pronunciation  of  a 
word  as  well  as  its  length.  It  may  also  contain  multiple  pronunciations  of  the  same  word 
to  account  for  regional  differences  in  pronunciation.  The  “matching”  process  does  not 
seek  out  an  exact  phoneme  match  but  rather  looks  for  the  best  match.  Using  a  procedure 
known  as  Stochastic  Processing,  incoming  signals  are  compared  to  a  set  of  potential 
candidates  using  Hidden  Markov  Models  (HMM),  which  provide  a  way  to  represent  the 
likelihood  of  a  transition  from  one  phoneme  to  the  next  in  a  given  word. 

These  comparisons  produce  a  probability  score  indicating  the  likelihood 
that  a  particular  stored  HMM  reference  model  is  the  best  match  for  the 
input.  [Ref.  5] 

This  approach  allows  the  computer  to  focus  on  the  shape  of  the  vocal  tract  and  make 
allowances  for  extraneous  information  and  slight  differences  that  occur  each  time  a  given 
word  is  spoken. 
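
The best-match idea can be sketched with the forward algorithm, a standard way to compute the likelihood of an observation sequence under a Hidden Markov Model. The sketch below is a toy under stated assumptions: the acoustic labels ("hiss", "vowel", "burst"), the two word models, and every probability value are hypothetical and chosen only for illustration. It does not represent the internals of any commercial SRS.

```python
def forward_likelihood(observations, start_p, trans_p, emit_p):
    """Return P(observations | model) computed with the forward algorithm."""
    if not observations:
        raise ValueError("at least one observation is required")
    states = list(start_p)
    # Initialization: start in each state and emit the first observation.
    alpha = {s: start_p[s] * emit_p[s].get(observations[0], 0.0) for s in states}
    # Recursion: move between phoneme states and emit each later observation.
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s].get(obs, 0.0)
            * sum(alpha[prev] * trans_p[prev].get(s, 0.0) for prev in states)
            for s in states
        }
    return sum(alpha.values())


def best_match(observations, models):
    """Score every stored word model and return the most likely word."""
    scores = {word: forward_likelihood(observations, *m) for word, m in models.items()}
    return max(scores, key=scores.get), scores


def toy_word_model(first_phoneme, first_emissions):
    """Build a hypothetical left-to-right model: first phoneme, then 'iy', then 't'."""
    start = {first_phoneme: 1.0, "iy": 0.0, "t": 0.0}
    trans = {
        first_phoneme: {first_phoneme: 0.5, "iy": 0.5},
        "iy": {"iy": 0.5, "t": 0.5},
        "t": {"t": 1.0},
    }
    emit = {
        first_phoneme: first_emissions,
        "iy": {"vowel": 0.9, "hiss": 0.1},
        "t": {"burst": 0.9, "vowel": 0.1},
    }
    return start, trans, emit


# Hypothetical models for two of the example words in the quotation above.
MODELS = {
    "seat": toy_word_model("s", {"hiss": 0.8, "burst": 0.2}),
    "beat": toy_word_model("b", {"burst": 0.8, "hiss": 0.2}),
}
```

Calling `best_match(["hiss", "vowel", "vowel", "burst"], MODELS)` scores both models and selects "seat", because the initial "hiss" is more probable under the assumed /s/ emissions than under the assumed /b/ emissions. Neither model matches exactly; the higher probability score wins.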

The  adaptability  of  SRS  technology  is  one  of  its  strengths.  SRS  technology  may 
be  incorporated  into  a  Voice  Activated  Command  System  (VACS)  which  uses  the  digital 
signal  output  to  control  other  electronics  or  machinery.  SRS  software  also  has  several 
parameters  that  can  be  adjusted  based  on  the  needs  of  the  user.  These  parameters  and  the 
range  of  the  adjustment  are  shown  in  Table  1.  The  parameter  settings  utilized  in  this 
thesis  research  include  speaking  mode  and  style,  user  enrollment,  vocabulary  and 
language sensitivity.

PARAMETER         RANGE
Speaking Mode     Isolated to Continuous
Speaking Style    Scripted to Spontaneous
Enrollment        Speaker Dependent to Speaker Independent
Vocabulary        Small to Large
Language Model    Finite State to Context Sensitive

Table 1. SRS Parameters from Ref. 6


This  thesis  focuses  on  the  continuous  speaking  mode,  allowing  the  user  to  speak 
naturally  as  opposed  to  pausing  between  each  word  when  using  an  isolated  speaking 
mode.  All  verbal  orders  given  on  the  bridge  of  a  ship  consist  of  short  phrases  that  are 
spoken  naturally.  For  this  reason,  a  system  using  an  isolated  speaking  mode  would  be 
ineffective  for  ship  control  purposes. 

A  scripted  speaking  style  is  designed  for  users  who  will  read  information  and 
avoid verbal irregularities such as verbal pauses (“uhs” and “ums”). A spontaneous
speaking  style  is  more  characteristic  of  the  bridge  of  a  ship  and  hence  will  be  explored  in 
this  thesis.  The  software  can  be  “trained”  to  filter  out  verbal  irregularities  as  described 
below. 

SRS is available in speaker-independent and speaker-dependent
varieties. This thesis focuses on a speaker-dependent system. A speaker-independent
system is capable of recognizing the voices of many different speakers, whereas a
speaker-dependent system is trained to specific voices. [Ref. 7] The process of training
the system to a specific individual is often referred to as “setting up a user profile.” Each
user sets up a profile by repeating a set of words or phrases multiple times so that the
software can create a baseline model of the user’s speech patterns. The model allows for
a certain degree of variability, such as changes in pitch or pace, raspy voices, and other
nonstandard speech tendencies, and it accounts for slight differences that may occur each
time the speaker speaks.

The size of the vocabulary utilized by most SRS is adjustable. The vocabulary,
sometimes referred to as the library, is a list of words that the software can recognize.
Small vocabularies contain fewer than 1,000 words while most large vocabularies can
handle up to 70,000 words. The size of the vocabulary selected is dependent on the task
to be accomplished. Dictating a legal memo, for example, would require a much larger
vocabulary than generating a basic grocery list. SRS is more efficient and accurate when
a small vocabulary is used because there are fewer alternatives from which the computer
chooses. [Ref. 8] For this reason, this study examines SRS performance using a small
vocabulary. The specialized orders used for driving naval ships are called Standard
Commands. They consist of a very limited number of words in a specific order (see
Appendix A), which makes them ideally suited for use in a limited vocabulary SRS.
Chapter II will provide more details regarding the use of Standard Commands.

Another  parameter  difference  that  can  exist  between  speech  recognition  systems 
is  the  use  or  non-use  of  a  language  model.  A  context-sensitive  language  model  will 
inspect  the  surrounding  words  in  order  to  determine  which  word  to  insert.  Often  a 
statistical  language  model  determines  the  estimated  frequency  of  word  usage  and  selects 
the  most  probable  sequence  of  words.  [Ref.  9]  The  SRS  software  used  in  this  study 
includes  a  built-in  language  model. 
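
A statistical language model of this kind can be sketched as a bigram model with add-one smoothing. This is a generic illustration, not the model built into the software used in this study. The training phrases and the acoustically confusable alternatives ("write", "fall", "rubber") are hypothetical examples chosen for the sketch.

```python
from collections import Counter
from itertools import product


def train_bigrams(phrases):
    """Count word-pair frequencies in example phrases; <s> marks a phrase start."""
    context_counts, pair_counts, vocabulary = Counter(), Counter(), set()
    for phrase in phrases:
        words = ["<s>"] + phrase.lower().split()
        vocabulary.update(words[1:])
        context_counts.update(words[:-1])          # words that precede another word
        pair_counts.update(zip(words, words[1:]))  # adjacent word pairs
    return context_counts, pair_counts, len(vocabulary)


def sequence_probability(words, context_counts, pair_counts, vocab_size):
    """Estimate the probability of a word sequence with add-one smoothing."""
    probability = 1.0
    for previous, current in zip(["<s>"] + list(words), words):
        seen = pair_counts[(previous, current)]
        probability *= (seen + 1) / (context_counts[previous] + vocab_size)
    return probability


def most_probable_sequence(candidates, model):
    """Pick the highest-probability sequence from per-position word alternatives."""
    best = max(product(*candidates), key=lambda seq: sequence_probability(seq, *model))
    return list(best)


# Hypothetical training phrases, for illustration only.
MODEL = train_bigrams([
    "right full rudder",
    "left full rudder",
    "right standard rudder",
    "all engines ahead full",
])
```

Given the alternatives `[["write", "right"], ["fall", "full"], ["rubber", "rudder"]]`, `most_probable_sequence` returns `["right", "full", "rudder"]`, since the surrounding words make that sequence far more probable than any combination containing an unseen word pair.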


C. PRESENT DAY SPEECH RECOGNITION SOFTWARE USES

Advancements  in  speech  recognition  technology  have  made  it  useful  for  a  variety 
of  commercial  and  private  uses  including:  dictation,  personal  computer  interfaces, 
inventory  maintenance,  automated  telephone  services  and  special  purpose  industrial 
applications.  [Ref.  10]  Even  items  that  are  as  small  as  cellular  phones  and  personal  data 
assistants  are  now  capable  of  recognizing  hundreds  of  words.  In  the  home,  speech 
recognition software simplifies the man-machine interface by allowing for verbal control
of  such  items  as  televisions,  household  lighting,  environmental  controls,  [Ref.  11]  and 
stereo  systems.  [Ref.  12] 

The  Department  of  Defense  has  also  taken  an  interest  in  the  applications  of  SRS 
technology. The hands-free, heads-up nature of SRS makes it ideal for military applications. In
addition, its manpower-saving attributes have led to its use in training
simulators.  For  example,  the  U.S.  Navy  Surface  Warfare  Officers  School  (SWOS)  in 
Newport,  Rhode  Island,  now  uses  a  Voice  Activated  Bridge  Simulator  to  teach  ship 
handling  skills  to  newly  commissioned  officers.  In  Groton,  Connecticut,  the  U.S.  Navy 
Submarine  School  has  reduced  staffing  needs  for  its  Virtual  Submarine  trainer  by 
introducing  SRS  technology.  Personnel  involved  with  submarine  simulator  operations 
perceive  value  in  SRS. 

Voice  recognition  and  synthesis  software  allow  the  student  to  interact  with 
a computer-generated navigator, helmsman, and engineering officer of the
watch.  The  students  can  issue  commands  that  the  computer  sub 
recognizes  and  responds  to  just  as  humans  would.  [Ref.  13] 

Note that these systems are speaker independent and do not require users to set up
profiles before use. As a result, they may be more susceptible to errors caused by accents
and rises in pitch due to excitement in response to simulated hazardous situations. This
susceptibility reinforces a standard, consistent form of speech among conning officers.

In  the  training  environment,  there  is  an  added  value  to  using  speaker 
independent  systems.  They  force  students  to  learn  to  remain  calm  on  the 
bridge  and  give  verbal  orders  in  a  clear,  crisp  voice.  [Ref.  14] 

However,  natural  variability  in  human  performance  is  a  reality  in  the  fleet.  A 
robust,  reliable  VACS  will  have  to  respond  to  orders  accurately.  To  do  so  it  will  need  to 
account  for  speaker  dependent  variation. 


D.  PREVIOUS  SRS  RESEARCH 

In February 2001, Ingalls Shipbuilding conducted an experiment to test the
usefulness  of  an  Integrated  Bridge  System  (IBS)  that  they  had  developed.  An  integral 
part  of  the  IBS  was  VACS.  Even  though  the  purpose  of  their  test  did  not  focus  on 
VACS,  the  study  yielded  insight  regarding  SRS. 

•  Participants  in  the  study  preferred  VACS  to  normal  control  methods  but 
agreed  that  there  needed  to  be  the  ability  for  the  conning  officer  to  take 
manual  control  if  necessary. 

•  The  testing  also  revealed  a  need  for  a  standard  command  vocabulary  to  be 
built into the VACS.

•  Finally, tests showed that there needed to be some type of resolution for a
misinterpreted  command  so  that  the  system  would  not  take  incorrect 
action  or  fail  to  respond.  [Ref.  15] 

A June 2003 Naval Postgraduate School (NPS) study developed an experiment
designed to show the reliability of commercial-off-the-shelf (COTS) speech recognition
software [Ref. 16]. It used a commercially available SRS called Dragon NaturallySpeaking,
version 6.0 (DNSv6.0) to record the verbal orders of conning officers who were driving a
simulated ship. DNSv6.0 is a continuous, spontaneous, speaker-dependent system that
utilizes a large vocabulary of over 20,000 words and has a built-in language model.

The  experiment  took  place  in  the  Marine  Safety  International  (MSI)  San  Diego, 
California,  shipboard  simulator  and  used  experienced  ship  handlers  as  test  subjects. 
Using  a  wireless  microphone,  test  subjects  transmitted  verbal  commands  to  a  nearby 
laptop  computer  which  then  used  DNSv6.0  to  convert  the  verbal  orders  to  text.  These 
text  files  were  later  analyzed  for  errors  made  by  the  software  [Ref.  17].  The  research 
conclusions  provide  insight  into  the  use  of  SRS  for  ship  control  purposes. 
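
The source does not state how software errors were counted in those text files. One common measure, offered here only as an assumed illustration, is word error rate: the word-level edit distance (substitutions, insertions and deletions) between the order as spoken and the recognized text, divided by the number of words spoken. The function names and example phrases below are hypothetical.

```python
def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # previous[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words (Levenshtein distance on words).
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1], len(ref)


def word_error_rate(pairs):
    """Aggregate error rate over (spoken order, recognized text) pairs."""
    errors = total = 0
    for reference, hypothesis in pairs:
        e, n = word_errors(reference, hypothesis)
        errors += e
        total += n
    if total == 0:
        raise ValueError("no reference words to score")
    return errors / total
```

For example, if "right full rudder" is transcribed as "write full rudder", one of three words is in error. If "all engines ahead full" is transcribed as "all engines a head full", the count is two errors against four spoken words (one substitution and one insertion).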

First, results varied based on who used the SRS. Some subjects seemed to be able to
speak  more  clearly  than  others  and  therefore  had  fewer  errors.  The  study  hypothesized 
that  additional  system  training  for  each  test  subject  could  potentially  eliminate  some  of 
this  variability. 

Second,  the  results  demonstrated  that  the  operational  scenario  had  no  impact  on 
the  system  performance.  In  other  words,  it  did  not  matter  if  the  test  was  conducted  on  a 
simulated  Destroyer,  Frigate  or  Cruiser.  Further,  it  did  not  matter  if  the  simulated  ship 
was  entering  port,  leaving  port,  engaged  in  open  ocean  transit  or  any  combination  of 
these. 

This  study  also  revealed  that  the  ambient  noise  level  of  the  setting  influenced  SRS 
performance.  While  the  SRS  profiles  were  developed  in  a  relatively  quiet  room,  the 
experiment  was  conducted  in  a  simulator  with  increased  ambient  noise.  Subsequent 
analysis  pointed  out  that  initial  profile  development  and  experiment  conduct  for  each  test 
subject  should  take  place  in  the  same  setting  to  “teach”  the  system  to  filter  out  any 
ambient noise present. [Ref. 18]

Based on the lessons learned from Ingalls Shipbuilding and previous NPS
research,  follow-on  research  is  necessary  to  better  determine  the  viability  of  COTS  SRS 
as  used  for  ship  control  purposes.  In  the  chapters  that  follow,  this  thesis  documents  that 
follow-on  research.  First,  however,  it  is  important  to  discuss  why  this  research  is  of 
interest to the U.S. Navy.


II. APPLICABILITY TO NAVAL VESSELS


A.  WATCHSTANDING 

A  U.S.  Naval  ship  typically  has  eight  watchstanders  manning  the  bridge  while 
underway  at  sea.  The  watchstander  positions  and  their  duties  can  be  found  in  Table  2 
[Ref.  19].  Some  ships  may  modify  this  list  by  adding  to  or  subtracting  from  these 
positions  based  on  the  vessel  traffic  density,  visibility  and  other  navigationally  significant 
circumstances. 


POSITION                              DUTIES

Officer of the Deck (OOD)             Represents the Captain and makes decisions
                                      regarding the safe operation of the ship.

Junior Officer of the Deck (JOOD)     OOD in training - usually handles tactical
                                      communications and computes maneuvering
                                      solutions.

Conning Officer (CONN)                Issues rudder and propulsion orders to the
                                      helmsman and lee helmsman.

Boatswain’s Mate of the Watch (BMOW)  Supervises the enlisted watch team.
                                      Usually a qualified master helmsman.

Quartermaster of the Watch (QMOW)     Navigates the ship and keeps the deck log.
                                      Usually qualified as a helmsman.

Helmsman                              Carries out the rudder orders of the conning
                                      officer by steering the ship.

Lee Helmsman                          Carries out the propulsion orders of the
                                      conning officer by making speed adjustments.

Phone Talker                          Maintains communications between vital
                                      stations.


Table  2.  Bridge  Watch  Stations 


This  thesis  will  focus  on  the  Conning  Officer,  the  Helmsman,  and  the  Lee 
Helmsman  watchstanders.  The  interaction  among  these  three  individuals  is  meant  to 
ensure  that  no  order  is  misunderstood.  Each  order  given  by  the  conning  officer  is 
repeated  back  verbatim  to  ensure  complete  understanding  by  the  helmsman  or  lee 
helmsman.  In  this  fashion,  immediate  corrective  action  can  be  taken  if  any  order  is 
misunderstood. This system of repeat-backs also serves two other important purposes.
First,  it  aids  accountability  by  enabling  the  quartermaster  of  the  watch  to  record  each  of 
the  conning  officer’s  orders  in  the  deck  log.  Second,  it  helps  everyone  on  the  bridge 
watch  team  maintain  awareness  regarding  the  status  of  ship  maneuvers. 

Orders  issued  by  the  conning  officer  are  standard  commands,  the  exact  words  and 
the  sequence  of  which  are  formalized  on  all  naval  warships.  The  list  of  standard 
commands  can  be  found  in  Appendix  D.  Note  that  it  is  a  relatively  small  vocabulary 
totaling  fewer  than  100  words.  The  exact  number  depends  on  the  ship  type.  This  small 
vocabulary  makes  ship  driving  a  strong  candidate  for  speech  recognition  software 
implementation.  Newly  commissioned  Surface  Warfare  Officers  (ship  drivers)  undergo 
extensive  training  to  learn  to  use  the  standard  commands  properly.  By  the  time  an  officer 
has completed the qualification process, the standard commands are as second nature as
speaking. 

B.  REDUCED  MANNING  ISSUES 

By reducing the manpower required to operate a ship at sea, the
Navy also reduces ship life-cycle costs [Ref. 20]. There are, however, additional reasons
for  reducing  ship  manning  requirements.  Many  ships  in  the  Navy  today  are  unable  to 
meet  their  allocated  manning  levels  and  watch  station  requirements.  [Ref.  21]  An 
undermanned  ship  is  more  prone  to  manpower  fatigue,  has  little  room  for  training 
replacement  personnel  and  has  the  risks  associated  with  reduced  redundancy,  potentially 
affecting the safety of the ship itself. As of 2001, ninety-one percent of all mishaps
reported  to  the  Naval  Safety  Center  were  caused  by  human  error.  In  many  of  these  cases, 
improper  training  or  fatigue  played  a  role.  [Ref.  22]  In  addition  to  saving  money, 
reducing  manning  requirements  through  the  installation  of  technology  may  also  alleviate 
current  shortages,  thereby  making  ships  safer. 

The  course  of  action  prescribed  by  the  Naval  Transformation  Roadmap  is  to 
“...insert  technology  to  carry  out  operations  in  ways  that  profoundly  improve  current 
capabilities  and  develop  desired  future  capabilities.”  [Ref.  23]  Aligned  with  this  guidance 
is the Smart Ship program which was developed to reduce shipboard personnel numbers
by inserting technology that replaces watchstanders. The results of this initiative have
been so successful that newer ships are being designed with more technology and smaller
crew sizes. A prime example is the Navy’s Littoral Combat Ship (LCS), which is still
being developed but for which project decision makers envision dramatically reduced manning
levels. Senior naval officials acknowledge that it is only a matter of time before this move
to  replace  watchstanders  with  technology  affects  the  way  navy  warships  man  their  bridge 
watch  teams  through  “...  a  significant  reduction  in  bridge  manning  needs.”  [Ref.  24] 

In  his  2004  guidance  the  Chief  of  Naval  Operations  stated,  “As  our  Navy 
becomes  more  high  tech,  our  workforce  will  get  smaller  and  smarter.”  [Ref.  25]  His 
words  rang  true  in  January  2004  as  1,900  billets  were  trimmed  from  the  fleet.  The  2005 
budget  includes  further  plans  to  eliminate  sailor  and  officer  jobs  throughout  the  Navy. 
[Ref.  26]  As  Admiral  Clark  puts  it,  “...we  do  not  want  to  spend  one  extra  penny  for 
manpower  we  do  not  need.”  [Ref.  27]  The  cuts  are  enabled  by  the  elimination  of 
redundant functions and the installation of manpower-saving technology. The CNO
wants  to  “look  at  options  for  carrying  out  midterm  modernization  on  all  the  Navy’s 
surface  ships.”  [Ref.  26] 


C. TECHNICAL FEASIBILITY & IMPLEMENTATION

Use  of  SRS  for  ship  control  purposes  could  eliminate  the  helmsman  and  lee 
helmsman  watch  stations  during  open-ocean  steaming.  The  purpose  of  this  study  is  to 
assess  the  software  technology’s  ability  to  replace  these  watchstanders.  VACS  could  be 
faster  and  more  accurate  than  a  human  watchstander  as  well. 

...  Voice  Recognition  system  devices  (the  system’s  hardware)  would  be 
physically  installed  into  the  ship’s  current  Ship’s  Control  Console... 
connected  electronically  from  the  SCC  to  the  engineering  propulsion  and 
steering  systems  for  immediate  responses  to  the  Conning  Officer’s  orders. 

The  Conning  Officer  and  Officer  of  the  Deck  would  both  be  equipped 
with  cordless  microphone  headsets  that  would  have  attached  activation 
switches allowing navigational commands to be given on demand. [Ref. 28]

In order to maintain the current checks and balances between the Conning Officer and the
helmsman or lee helmsman,

...the VR system would be equipped with a series of speakers installed
throughout  the  ship’s  bridge.  The  purpose  of  the  bridge  speakers  is  to 
broadcast  orders  given  by  the  Conning  Officer  as  well  as  the  repeat-back 
by  the  VR  system.  This  enables  all  bridge  watch  standers  to  hear  the 
orders  and  repeat-backs,  allowing  them  to  maintain  situational  awareness 
as  to  how  the  ship  is  being  driven  and  to  anticipate  the  ship’s  actual 
movements.  The  speakers  will  also  serve  to  provide  a  means  for  the  VR 
system  to  repeat  back  the  ordered  command.  [Ref.  29] 

The  system  could  also  be  programmed  to  ask  the  conning  officer  to  repeat  the  command 
(e.g.,  “Orders  to  the  Helm?”)  if  the  system  did  not  find  a  match  in  its  standard  command 
library. 
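
The match-or-reprompt behavior described above can be sketched as a lookup against a restricted command library. The library contents, the function name, and the return format are hypothetical and for illustration only; the actual Standard Commands are listed in the appendix, and a fielded VACS would need additional safeguards.

```python
# Illustrative subset only; not the full list of Standard Commands.
STANDARD_COMMANDS = {
    "right full rudder",
    "left full rudder",
    "rudder amidships",
    "all engines ahead full",
}

REPROMPT = "Orders to the Helm?"


def respond(recognized_text, command_library=STANDARD_COMMANDS):
    """Return (action, spoken_response) for one recognized utterance.

    A recognized order is repeated back verbatim and marked for execution.
    Anything not found in the library triggers the reprompt and no action.
    """
    normalized = " ".join(recognized_text.lower().split())
    if normalized in command_library:
        return "execute", normalized
    return "repeat", REPROMPT
```

With this sketch, `respond("Right Full Rudder")` yields `("execute", "right full rudder")`, while a misrecognized string such as "write full rubber" yields `("repeat", "Orders to the Helm?")`, so the system neither takes incorrect action nor fails silently.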

A  final  issue  to  consider  when  implementing  a  VACS  is  casualty  control.  As 
suggested by the Ingalls IBS testing, a quick disconnect button is necessary so that any
time  the  need  arises  the  ship  can  return  to  manual  mode.  As  with  most  other  vital 
shipboard  equipment,  a  monitoring  and  alarm  panel  would  enable  instant  fault  detection, 
prompting  bypass  of  VACS.  Upon  bypass  of  VACS,  another  bridge  watchstander  could 
step in and execute the functions of helmsman and/or lee helmsman.

Implementation  of  SRS  on  the  bridge  of  Navy  ships  is  technically  feasible  and 
may  actually  prove  more  efficient  than  the  manual  control  methods  currently  in  place. 
Further, such a system requires very few procedural changes to bridge watch standing
while aiding the ongoing effort to reduce the number of personnel required to operate a
ship at sea. There is, however, well-documented resistance to the idea of using SRS
onboard naval ships.

D.  PSYCHOLOGICAL  BARRIERS 

One  of  the  greatest  obstacles  to  implementing  this  new  technology  is  the  human 
resistance to change. [Ref. 30] The Navy is an organization based on longstanding
traditions with bureaucratic forces that encourage maintaining the status quo. Leaders
who fail to uphold the traditional way of doing things are seen as “risk takers.” However,
it is precisely these “risk takers” who may enable innovations and progress in the Navy.
[Ref. 31]

In an October 2002 study, 110 Surface Warfare Officers ranging in rank from
Ensign to Captain were asked if they would allow a voice activated control system aboard
ships. Eighty percent of these said that they would allow it. Many added that initially its
use should be limited to certain circumstances. [Ref. 32] The condition most often stated
as a qualifier to VACS use was that the ship be in open ocean transit with no other
vessels nearby. Most respondents also said that with time and proven reliability,
restrictions to use could be relaxed. [Ref. 33]

The  remaining  twenty  percent  of  respondents  stated  that  they  would  not  endorse 
the  use  of  VACS  onboard  Navy  ships.  [Ref.  34]  Reasons  given  for  not  wanting  to 
implement  VACS  included  the  perceived  increased  risks  associated  with  “letting  a 
computer  drive  the  ship”  and  the  lack  of  human  interaction  between  the  helmsman  and 
conning  officer.  [Ref.  35]  Respondents  suggested  that  having  a  helmsman  in  the  loop 
added  an  additional  safety  check  in  driving  the  ship,  because  a  good  helmsman  may  catch 
an  error  made  by  the  conning  officer. 

The first of these two arguments has little merit, as computers are used for a
number of risk-inherent activities. The Aegis computer system can be trusted to defend
the  ship  in  battle.  Analogously,  a  VACS  with  similar  redundancies  and  safeguards  could 
relay  the  conning  officer’s  orders  to  the  engines  and  rudders.  The  second  argument 
regarding  helmsman  and  conning  officer  interaction  has  some  validity.  However,  even  if 
the  helmsman  were  not  present,  other  personnel  on  the  bridge  could  alert  the  conning 
officer  to  an  erroneous  decision;  specifically,  the  Officer  of  the  Deck  or  an  alert 
Quartermaster of the Watch. Additionally, the current speech-to-text capability of SRS
will  alleviate  quartermaster  deck  log  duties,  allowing  for  greater  oversight. 

A conning officer may not understand how a VACS works and therefore
feel less in control of it than of a human helmsman. By virtue of their positions, naval
officers are used to being in control and the idea of relinquishing some of that control
may  be  unnerving.  With  exposure  to  the  system  over  time  and  proven  reliability,  VACS 
use can overcome the psychological barriers that reside in some naval officers.


III. METHODOLOGY


A.  EXPERIMENT  OBJECTIVE 

The  objective  of  this  study  is  to  identify  factors  which  affect  the  performance  of 
specific  commercial-off-the-shelf  speech  recognition  software  when  used  for  ship  control 
purposes.  Specific  factors  examined  include  the  effects  of: 

•  A  restricted  vocabulary  versus  a  large  vocabulary, 

•  Low  experience  level  conning  officers  versus  high  experience  level 
conning  officers, 

•  Male  versus  female  voices, 

•  Pre-test  training  on  specific  words  versus  no  pre-test  training. 

The  study  builds  upon  previous  SRS  research  and  uses  data  from  a  prior  experiment  to 
examine  the  relevance  of  the  above  factors. 


B.  EXPERIMENTAL  SETTING 

1. Prior Research

As  outlined  in  Chapter  II,  prior  experimentation  with  SRS  sought  to  determine 
factors  that  affected  error  rates.  The  necessity  for  this  follow-on  study  is  grounded  in  the 
need  to  build  upon  that  previous  experimentation. 

•  An  SRS  with  the  default  20,000  word  vocabulary,  utilized  in  the  previous 
experimentation,  may  not  have  been  well  matched  to  the  conning 
application  under  consideration,  due  to  its  limited  vocabulary  requirement. 
This  study  analyzes  the  impact  of  replacing  the  large  vocabulary  with  a 
small  restricted  conning  vocabulary. 

•  Test  subjects  in  the  previous  study  were  all  very  proficient  male  ship 
handlers  each  with  over  ten  years  of  ship  driving  experience.  Actual  ship 
drivers in the fleet are usually newer officers, of both genders, with only
limited  experience.  The  higher  experience  level  of  the  previous  test 
subjects  or  their  gender  may  have  biased  the  resultant  data.  The  current 
study  uses  test  subjects  with  both  high  levels  and  low  levels  of  ship 
driving  experience  to  determine  what  impact  experience  level  has  upon  the 
SRS  performance.  In  addition,  female  test  subjects  are  introduced, 
although  specific  SRS  performance  variation  did  not  drive  experiment 
design. 

•  The  prior  SRS  study  included  no  additional  system  training  after  the 
establishment  of  each  test  subject’s  profile.  SRS  manufacturers  claim  that 
additional system training will improve the accuracy of the SRS [Ref. 36].
An absence of additional training may cause an increased number of
errors. This current study addresses the issue by incorporating
pre-experiment system training for some test subjects to determine its value in
making SRS more accurate.

•  Profile  establishment  in  the  earlier  study  took  place  in  the  control  room 
while  actual  testing  was  conducted  in  not  only  the  control  room  but  the 
simulator  as  well.  It  could  be  argued  that  the  profile  established  in  the 
control  room  was  less  effective  in  the  simulator  because  the  ambient  noise 
levels  and  acoustic  qualities  varied  between  these  two  locations.  Ambient 
noise  measurements  revealed  a  16  dB  difference  between  the  two  rooms. 
[Ref. 37] Further, an argument could be made that the control room does
not  accurately  reflect  the  actual  noise  levels  experienced  on  the  bridge  of  a 
navy  ship.  It  lacks  the  many  electronic  navigation  devices  present  on  the 
bridge  of  a  navy  ship  and  in  the  simulator.  The  current  study  conducts  all 
profile  establishment  and  testing  in  the  simulator. 

These issues, and their implications for individual influence and/or combined interaction,
justify a re-examination of the sources of error in COTS SRS and form the basis for this
study.


2.  Current  Research 

While  there  are  differences  between  this  study  and  the  previous  investigation,  it  is 
also  important  to  discuss  the  similarities.  For  example,  it  was  important  to  hold  constant 
in  this  study  many  of  the  details  of  the  previous  one  so  that  a  valid  statistical  comparison 
between  the  two  can  be  made.  The  main  difference  between  the  two  studies  is  the  size  of 
the  SRS  vocabulary.  Other  factors  including  the  COTS  SRS  software,  the  experimental 
setting,  the  basic  test  procedure,  and  the  equipment  resembled  the  previous  work  as 
closely  as  possible.  Just  as  in  the  prior  SRS  study,  the  experiment  was  conducted  with 
the  support  of  Marine  Safety  International  (MSI)  facilities  using  Dragon  Naturally 
Speaking  Version  6.0  (DNSV6.0). 


3.  Marine  Safety  International 

Marine Safety International provides ship handling, Bridge Resource
Management (BRM), Electronic Chart Display Information System (ECDIS), Integrated
Bridge Systems (IBS) and Automatic Radar Plotting Aids (ARPA) training courses for
the U.S. Navy, the U.S. Coast Guard, MSC, NOAA, international navies and coastal
patrols. There are three MSI locations within the United States: San Diego, CA; Norfolk,
VA; and Newport, RI. The equipment at each location is identical and all training is
based  upon  a  common  curriculum.  Each  facility  features  both  a  bridge  wing  simulator 
and  a  full  mission  bridge  simulator.  Only  the  bridge  wing  MSI  simulator  in  Newport  was 
utilized  for  this  study.  Figure  1  shows  the  floor  plan  of  MSI  Newport. 


Figure  1.  MSI  Newport,  RI,  Floor  Plan  from  Ref.  38 


Note  that  the  previous  experiment  took  place  at  the  San  Diego  MSI  and  not  in 
Rhode Island. [Ref. 39] However, as stated above, all of the MSI facilities are
sufficiently  similar  with  the  only  detectable  difference  being  the  layout  of  the  building’s 
floor  plan.  [Ref.  38]  Even  the  amount  of  ambient  noise  present  in  the  simulator  at  both 
locations  is  comparable.  Measurements  taken  with  a  Type  2  dB-A  sound  level  meter 
revealed  an  ambient  noise  level  of  64.8  dB  in  the  previous  study  [Ref.  40]  while  the 
ambient  noise  measurements  were  66.2  dB  in  the  Rhode  Island  bridge  wing  simulator. 
[Ref. 41] This slight difference is acceptable and very realistic when put into the context
of actual ship driving in which ambient noise levels will vary significantly. Additional
background  noise  such  as  rain,  fog  horns,  etc.  can  be  added  to  the  simulation  but  this 
feature  was  not  used  during  either  experiment.  All  data  collection  and  test  subject  profile 
establishment  for  this  current  study  took  place  in  the  simulator  with  the  baseline  64.8  dB 
sound  level.  [Ref.  42] 

4.  Dragon  Naturally  Speaking  Version  6.0 

This study uses the exact same speech recognition software as the previous
experiment, Dragon Naturally Speaking Version 6.0 (DNSV6.0), which has the following
characteristics:

•  Continuous  Speech  Recognition  capabilities, 

•  Speaker  dependence, 

•  Variable  vocabulary  that  allows  the  user  to  select  the  size  of  the 
vocabulary  desired  or  to  create  a  specialized  vocabulary, 

•  Spontaneous  speech  capabilities, 

•  User-friendly  graphic  interfaces  to  facilitate  profile  set  up  and  application 
use. 

This  software  is  designed  to  achieve  a  90  to  98  percent  accuracy  rate  for  most  users 
according  to  its  manufacturer.  DNSV6.0  has  been  top  ranked  seven  times  by  SRS 
Software  reviewers  and  this  current  version  is  recognized  to  be  superior.  [Ref.  42] 

5.  Test  Subjects 

Table 3 contains the test-subject data, featuring ten test subjects, five from the
MSI staff and five from the Naval Surface Warfare Officer’s School (SWOS). MSI test
subjects were all retired Navy Captains each with over fifteen years of ship handling
experience and a surface warfare qualification. Test subjects from SWOS were
pre-department head level surface warfare qualified lieutenants each with fewer than four
years  of  ship  handling  experience.  Two  of  the  low  experience  level  test  subjects  were 
female.  All  other  test  subjects  were  male. 
SUBJECT   GENDER   EXP LVL   SOURCE      SWO
I         M        Low       SWOS        Yes
II        M        Low       SWOS        Yes
III       M        High      MSI Staff   Yes
IV        M        High      MSI Staff   Yes
V         M        High      MSI Staff   Yes
VI        M        High      MSI Staff   Yes
VII       M        Low       SWOS        Yes
VIII      M        High      MSI Staff   Yes
IX        F        Low       SWOS        Yes
X         F        Low       SWOS        Yes

Table 3. Test Subject Data


6.  Experimental  Procedure 

Test  subjects  were  randomly  scheduled  in  two  hour  blocks,  as  shown  in  the 
MSI/NPS  Test  document  included  in  Appendix  B.  Each  test  subject  received  the  brief 
included  in  Appendix  C  upon  arrival  at  MSI.  Following  the  brief,  test  subjects  moved 
into  the  simulator  where  they  established  DNSV6.0  speech  profiles.  In  addition  to  the 
standard  profile  establishment,  all  but  three  test  subjects  underwent  specialized  training 
on  two  words  that  the  experiment  revealed  as  having  a  high  incidence  of  error,  “rudder” 
and  “starboard”.  Further  details  are  given  in  Chapter  IV. 

Next,  as  conning  officers  of  a  CG-47  class  Guided  Missile  Cruiser,  the  test 
subjects  participated  in  three  different  scenarios.  To  ensure  no  bias  based  on  the  scenario 
order,  the  sequence  in  which  these  scenarios  were  presented  was  randomized  and  varied 
for  each  test  subject.  Each  scenario  included  approaching  a  pier  and  then  getting 
underway from that same pier. Test subjects wore a SHURE ULX/S Standard Wireless
Microphone and issued all orders verbally. The wireless microphone transmitted the
verbal commands to a Sony VAIO FX250 laptop computer located in the control room,
approximately 20 feet away. The ULX/S has an RF carrier frequency range of 554 to
865 MHz with an effective range of 100 meters. The VAIO laptop was loaded with
DNSV6.0  and  converted  all  verbal  orders  into  text  for  analysis. 

During  the  simulator  trials,  test  subjects  received  no  repeat-backs  of  orders. 
Although this is a distinct departure from actual conning procedures, use of repeat-backs
and acknowledgements yields no insight into SRS performance. A system operator in the
control  room  performed  helm  and  lee  helm  functions. 
C. EXPERIMENT EXPECTATIONS

• E1: SRS with smaller vocabularies are more accurate (fewer errors)
than SRS with large vocabularies

A major expectation of this study was that a small SRS
vocabulary produces fewer errors than a large SRS vocabulary. As discussed, the smaller
vocabulary results in fewer choices for the software when attempting to match spoken
words to the library. This in turn should result in fewer software errors due to
misinterpretation.

•  E2:  The  experience  level  and/or  gender  of  the  SRS  user  will  have  no 
impact  on  the  performance  of  SRS 

For  acceptable  operability  in  the  fleet,  the  experience  level  of  the  user  should  have 
no  impact  on  the  SRS  performance.  As  long  as  conning  officers  use  standard  commands, 
the system should be able to recognize the verbal orders and convert them to text
regardless of the conning officer’s age, gender, or experience level. One exception to this
may be caused by stress-related pitch elevations in the user’s voice. Less-experienced
conning  officers  may  tend  to  be  more  easily  excited  while  ship  driving.  To  counter  this 
effect  during  testing,  the  ship  driving  scenarios  have  a  very  low  degree  of  difficulty  and 
each  test  subject  is  instructed  to  remain  calm  throughout  the  scenario  as  their  ship  driving 
abilities  are  not  the  focus. 

•  E3:  Test  subjects  who  undergo  additional  SRS  training  will  have  a 
lower  error  rate  than  those  who  do  not  undergo  the  additional 
training 

Both  ScanSoft,  Inc.  [Ref.  43]  and  conclusions  from  the  previous  SRS  study 
suggest  that  additional  training  prior  to  SRS  use  improves  accuracy.  It  is  therefore 
expected  that  test  subjects  who  receive  additional  training  will  experience  a  lower  error 
rate  than  those  that  do  not  perform  the  extra  training. 

Experimentation  was  conducted  over  a  three  day  period  beginning  October  27, 
2003.  No  problems  with  hardware  or  software  were  encountered  during  the  test 
procedure,  and  all  data  was  successfully  compiled  for  analysis  in  the  next  chapter. 
IV. ANALYSIS


A.  DATA  RESULTS 

A  total  of  30  trials  were  conducted  (10  test  subjects  with  3  trials  each).  Appendix 
D  contains  the  full  data  worksheet.  Analysis  suggests  four  types  of  errors,  three  of  which 
are  associated  with  the  VACS  and  one  with  the  conning  officer.  They  are  described  as 
follows: 

•  Type  1  error:  SRS  uses  the  wrong  word 

In  this  instance  a  misinterpretation  of  the  verbal  input  was  made  by  the  SRS  and 
an  incorrect  word  was  substituted  for  the  appropriate  word. 

•  Type  2  error:  SRS  adds  a  word  not  spoken 

This  error  occurs  when  the  SRS  believes  a  word  is  uttered  when  it  was  not.  Some 
instances  where  this  error  type  may  occur  include  inadvertent  contact  with  the 
microphone  causing  a  crackle  noise,  clearing  of  the  throat,  or  any  other  superfluous 
background  noise  detected  by  the  microphone  and  transmitted  to  the  SRS. 

•  Type  3  error:  SRS  does  not  acknowledge  a  spoken  word 

In this case, the SRS fails to receive the incoming acoustic signal. Some
extraneous causes of this error include a microphone failure, overpowering background
noise, or a very soft-spoken or extremely brief/fast verbal signal.

•  Type  4  error:  A  nonstandard  command  is  used 

It is possible for the conning officer, if not properly trained, to use an incorrect
command format, thereby presenting improper syntax for the SRS to interpret. This
occurs  because  the  conning  vocabulary  stored  in  the  SRS  memory  contains  only  the 
words  and  phrases  of  standard  commands.  As  a  result  the  SRS  will  be  unable  to  correctly 
identify  a  word  not  contained  in  the  restricted  vocabulary  list. 

The proper use of standard commands is paramount on the bridge of any naval
vessel and only through extensive training does a conning officer become proficient.
Because Type 4 errors reflect insufficient conning officer training, these errors are not
associated with measurements of SRS effectiveness. The remaining three types of errors
are suitable metrics for SRS, and are aggregated for this analysis. The errors are
combined, rather than examined independently by type, to reflect the
overall system performance and so that data from this study can be compared to the
earlier research in which the error types were also combined.
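
The three SRS error types correspond to the substitutions, insertions, and deletions of a word-level alignment between the command as spoken and the text the software produced. The following is a minimal sketch of such a tally, not the scoring procedure used in this study: the function names, the example commands, and the definition of the aggregate error rate as total errors divided by words spoken are all illustrative assumptions.

```python
def classify_errors(reference, recognized):
    """Align spoken words (reference) with SRS output (recognized) and
    count Type 1 (wrong word), Type 2 (added word), Type 3 (missed word)."""
    ref = reference.lower().split()
    hyp = recognized.lower().split()
    n, m = len(ref), len(hyp)

    # cost[i][j] = minimum number of word edits turning ref[:i] into hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back through the table to label each edit.
    counts = {"type1_wrong_word": 0, "type2_added_word": 0, "type3_missed_word": 0}
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and ref[i - 1] != hyp[j - 1]
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + mismatch:
            if mismatch:
                counts["type1_wrong_word"] += 1
            i, j = i - 1, j - 1
        elif j > 0 and cost[i][j] == cost[i][j - 1] + 1:
            counts["type2_added_word"] += 1
            j -= 1
        else:
            counts["type3_missed_word"] += 1
            i -= 1
    return counts


def error_rate(reference, recognized):
    """Aggregate error rate: all three SRS error types per word spoken (assumed definition)."""
    words_spoken = len(reference.split())
    if words_spoken == 0:
        raise ValueError("reference command must contain at least one word")
    return sum(classify_errors(reference, recognized).values()) / words_spoken


# Hypothetical command and transcription, for illustration only.
example = classify_errors("right standard rudder", "right standard runner")
```

Type 4 errors are deliberately left out of the tally, since they describe the conning officer's phrasing rather than the software's recognition.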


B.  DATA  ANALYSIS 

Before  performing  statistical  analysis  of  the  experiment  outcomes  it  is  necessary 
to ensure that the experiment meets all the prerequisites of sound design. As mentioned
in  Chapter  III,  randomizing  the  sequence  in  which  the  scenarios  were  presented  and  the 
assignment  of  test  subjects  to  time  slots  removes  the  chance  that  a  certain  order  of  events 
could  affect  the  experiment  outcome.  In  other  words,  randomization  ensures  that  chance 
governs  the  results  and  not  any  characteristic  of  the  experimental  procedure  or  the 
judgment  of  the  experimenter.  [Ref.  44]  Having  randomized,  the  next  step  is  to 
determine  if  normality  existed.  The  result  of  this  analysis  is  depicted  in  a  standard 
normal  quantile  plot,  using  statistical  software  package  S-Plus.  [Figure  2] 

The  adequacy  of  a  normal  model  for  describing  a  distribution  of  data  is 
best assessed by a normal quantile plot. A pattern on such a plot that deviates
substantially  from  a  straight  line  indicates  that  the  data  are  not  normal. 

[Ref.  45] 

The horizontal axis of this plot is numbered from -2 to 2. The zero point
represents the median data point. On either side of this median value are the next higher
or lower values. In this case, the standard normal quantile plot shows a relatively normal
distribution of SRS errors. Normally distributed data appear on such a plot as points that begin in
the lower left corner of the graph and follow an approximately straight line to the upper right corner
of the graph. Normal distributions approximate the outcomes of chance. This is an
important factor because any further analysis of variance (ANOVA) or sample mean
testing assumes a normally distributed variable. [Ref. 46] These statistical inference
procedures are built on means and standard deviations, which are sensitive to
outliers and other non-standard results.
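
To make the construction of a normal quantile plot concrete, the sketch below pairs each sorted observation with the standard normal quantile of its rank. It is an illustration under stated assumptions, not the S-Plus routine used here: the plotting position (rank - 0.5)/n is one common convention among several, and the error rates are hypothetical values, not study data.

```python
from statistics import NormalDist


def normal_quantile_points(values):
    """Return (standard normal quantile, observed value) pairs, smallest value first."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for rank, value in enumerate(ordered, start=1):
        # Assumed plotting position; other conventions shift the points slightly.
        p = (rank - 0.5) / n
        points.append((std_normal.inv_cdf(p), value))
    return points


# Hypothetical per-trial error rates, for illustration only.
sample_error_rates = [0.02, 0.05, 0.03, 0.08, 0.04]
points = normal_quantile_points(sample_error_rates)
```

With an odd number of observations the middle pair sits at quantile zero, which is the median point described above. If the pairs fall close to a straight line when plotted, a normal model is a reasonable description of the data.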
Figure 2. Standard Normal Quantile Plot (horizontal axis: Quantiles of Standard Normal)

ANOVA methodology examines explained and unexplained variations in
performance measures to determine the significance of a model. In other words, we are
looking to see whether the variation explained by a factor is large relative to the variation
left unexplained. The distance between the data points and their mean value is the variation.
The measure of variation is the distance between observation and expectation, tallied by
summing the squared differences.

Dividing the sums of squares by the appropriate degrees of freedom yields an
estimation of mean square differences. The ratio of the mean squares is an F-statistic that
shows the average amount of explained variation as compared to the average amount of
unexplained variation. The larger the F-value, the more explained variation and the less
unexplained. If there is no explanatory relationship then the ratio of explained variation
to unexplained variation will be small. This supports the null hypothesis which assumes
no significant explanation of observed performance. [Ref. 47]
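
The arithmetic behind a single-factor ANOVA table can be sketched in a few lines. The function below is a generic one-way ANOVA written for illustration, with hypothetical group data; it is not the S-Plus procedure that produced the tables in this chapter, and the dictionary keys are names chosen here to mirror the table columns.

```python
def one_way_anova(groups):
    """Compute the entries of a one-way ANOVA table for a list of groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Explained variation: how far each group mean sits from the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Unexplained variation: how far each observation sits from its own group mean.
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_value": ms_between / ms_within,
    }


# Hypothetical error rates for two groups, for illustration only.
table = one_way_anova([[0.02, 0.04, 0.03], [0.05, 0.07, 0.06]])
```

The "Df", "Sum of Sq", "Mean Sq" and "F Value" columns of the ANOVA tables correspond to the degrees of freedom, sums of squares, mean squares and ratio computed here. The sketch assumes at least two groups and some variation within the groups.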

After computing the F-statistic, based on the observed data, and comparing this
value to the known F distribution, analysis yields a P-value. This P-value is referred to as
“Pr(F)” in the analysis charts that follow. The P-value is the probability of observing
results at least as extreme as those seen during the experiment given that the null hypothesis is true. The null
hypothesis states that introduction of an explanatory variable will not have an effect on
the performance responses of the study. [Ref. 48] As discussed above, the null
hypothesis is that there is no difference in SRS performance among groups based on
vocabulary size, experience level or training. Armed with this knowledge we can apply
ANOVA to each of the expectations outlined in Chapter III specifying whether evidence
supports or refutes the null hypothesis.
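
For readers who wish to check a Pr(F) entry by hand, the sketch below converts an F value into a P-value for the special case of one numerator degree of freedom, which is the case for every factor tested in this chapter. It relies on the fact that an F statistic with 1 and v degrees of freedom is the square of a Student t statistic with v degrees of freedom, and integrates the t density numerically with Simpson's rule. The function names and the step count are assumptions made for illustration; a statistics package would normally be used instead.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def p_value_one_df(f_value, df_resid, steps=2000):
    """Pr(F) for an F statistic with 1 numerator and df_resid denominator degrees of freedom."""
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    upper = math.sqrt(f_value)  # F(1, v) equals t(v) squared
    h = upper / steps
    total = t_pdf(0.0, df_resid) + t_pdf(upper, df_resid)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df_resid)
    central_area = 2 * total * h / 3  # probability that |t| < sqrt(F)
    return 1.0 - central_area


# F value and residual degrees of freedom as reported for vocabulary size.
p_vocab = p_value_one_df(0.6811449, 35)
```

Applied to the reported F values and residual degrees of freedom, this calculation should closely reproduce the Pr(F) column of the ANOVA tables.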

C.  EXPECTATION  AND  DATA  COMPARISONS 

1. Expectation #1

The  first  expectation  for  this  study  is  that  SRS  performance  with  smaller 
vocabularies  is  more  accurate  (fewer  errors)  than  SRS  performance  with  large 
vocabularies. The analysis of this expectation required that the results from the earlier
SRS  study  using  a  large  vocabulary  be  compared  to  the  data  obtained  in  this  restricted 
vocabulary  study.  The  results  of  this  analysis  are  in  Table  4  and  clearly  indicate  that 
there  is  no  significant  difference  in  performance  among  users  of  the  restricted  and  the 
large  vocabulary.  The  likelihood  of  seeing  these  outcomes  if  there  were  no  difference  in 
SRS  performance  based  on  size  of  vocabulary  is  relatively  high. 


E1           Df   Sum of Sq    Mean Sq       F Value     Pr(F)
Exp vocab     1   0.00130975   0.001309746   0.6811449   0.4147828
Residuals    35   0.06730009   0.001922860

Table 4. ANOVA for Expectation 1 (Vocabulary Size)
One possible reason for the absence of a significant difference between
vocabulary  sizes  has  to  do  with  the  test  procedure.  According  to  the  DNSV6.0  user’s 
manual,  the  fastest  way  for  the  software  to  “learn”  is  for  users  to  make  corrections  to 
errors  as  they  occur.  The  testing  procedure  used  in  this  experiment  did  not  include  the 
use  of  the  DNSV6.0  correction  capabilities.  Test  subjects  made  no  corrections  during  the 
experiment,  potentially  lengthening  the  learning  curve  for  the  software.  In  fact,  a  review 
of  the  raw  data  shows  several  instances  where  the  same  test  subject  produced  errors  on 
the  same  phrase  multiple  times.  Perhaps  using  the  correction  function  after  the  original 
error  averts  follow-on  errors. 

Taking  the  level  of  accuracy  into  consideration  further  explains  these  results. 
With either a restricted vocabulary or a large vocabulary DNSV6.0 is ninety to
ninety-nine percent accurate in most trials. The reduced vocabulary trials make it easier and
possibly faster for the software to match words to spoken language, but do not
necessarily reduce errors caused by poor pronunciation, background noise, and other
non-SRS related factors. A small percentage of non-SRS related errors occur in each trial.
These  are  not  eliminated  by  reducing  the  size  of  the  vocabulary.  The  exact  number  of 
non-SRS  related  errors  varies  from  subject  to  subject  and  therefore  is  not  accounted  for  in 
this  experiment.  However,  while  the  reduced  vocabulary  SRS  did  not  significantly 
reduce  the  number  of  errors,  there  are  other  benefits  to  using  an  SRS  with  a  small 
vocabulary.  The  reduced  processing  time  associated  with  small  vocabulary  SRS  makes 
the  software  more  efficient  and  responsive.  This  potential  benefit  alone  makes  smaller 
vocabulary  SRS  more  desirable  for  highly  dynamic  applications. 

Finally,  as  reported  in  Chapter  I,  SRS  uses  statistical  language  models  to  predict 
the likelihood of a word occurring in a sentence. This experiment, however, did not use
normal  sentence  structure  and  grammar.  It  used  standard  naval  commands  which  do  not 
conform  to  the  rules  of  the  statistical  language  model.  The  statistical  model  therefore  lost 
some  of  its  predictive  power. 
2. Expectation #2

Expectation  number  two  states  that  the  experience  level  and/or  gender  of  the  SRS 
user  has  no  impact  on  the  performance  of  SRS.  While  it  was  always  the  aim  of  this  study 
to  examine  the  role  of  experience  level,  examining  the  issue  of  gender  was  not  part  of  the 
original  design.  The  comment  of  a  simulator  operator  at  Surface  Warfare  Officer’s 
School  led  to  the  incorporation  of  the  gender  issue.  One  of  the  SWOS  instructors  stated 
that  the  system  seemed  to  make  more  errors  with  female  operators  than  it  did  with  male. 
[Ref.  49]  Independent  and  combined  analysis  of  these  variables  was  completed  to  ensure 
there  are  no  confounding  effects. 


Figure  3.  Confounding  Variables 

Two  or  more  variables  are  confounded  when  their  effects  are  mixed  together.  [Ref.  50] 
Tables  5  through  7  show  the  analysis  of  experience  level,  gender  and  then  gender  and 
experience,  respectively. 


E2           Df   Sum of Sq   Mean Sq       F Value     Pr(F)
Experience    1   0.0028421   0.002842133   0.6373282   0.4313994
Residuals    28   0.1248646   0.004459450

Table 5. ANOVA for Expectation 2 (Experience Level)

E2          Df   Sum of Sq   Mean Sq      F Value    Pr(F)
Gender       1   0.0144058   0.01440583   3.637775   0.06678888
Residuals   28   0.1108818   0.00396007

Table 6. ANOVA for Expectation 2 (Gender)


E2           Df   Sum of Sq   Mean Sq      F Value    Pr(F)
Gender        1   0.0144058   0.01440583   3.643163   0.0669833
Experience    1   0.0041182   0.00411819   1.041471   0.3165372
Residuals    27   0.1067636   0.00395421

Table 7. ANOVA for Expectation 2 (Experience and Gender)


As seen above, there is no significant difference between test subjects based
solely on experience level. A P-value of .431 fails to refute the null hypothesis that the
experience level of the conning officer does not impact SRS performance. Gender,
however, appears to be a factor regardless of the experience level (P-value of .067,
significant at the 0.10 level though not at the 0.05 level). It
should be noted, though, that the sample size is insufficient to make serious SRS
generalizations. The sample size determines the margin of error and with only two
female test subjects our margin of error is very high. [Ref. 51] A larger female sample
size was not obtained, as stated above, because the original focus of this study did not
include the issue of gender.

The results of the gender and experience level analysis (Figure 4) show a general
increase in the error rate as one moves from experienced males to inexperienced males to
females. However, because there were no high experience level female test subjects the
possibility of confounding variables exists. Because all of the female test subjects were
inexperienced, it is unclear if the observed effects are due solely to their gender or a
combination of low experience and gender. Further research in this area is necessary to
separate the two variables. Another noticeable trend in Figure 4 is that there is a greater
degree of variability among the female test subjects. There is no apparent cause for this
increased data spread, which again shows that additional research with female SRS
users is warranted.


Figure  4.  Experience  Level  And  Gender  Plot 


The  dotted  line  shown  in  Figure  4  represents  the  median  values  in  each  category. 
The  two  trials  with  the  highest  error  rate  in  the  experienced  male  category  (circled)  were 
caused  by  a  single  test  subject  with  a  heavy  New  England  accent.  These  two  outliers 
increase  the  mean,  but  not  the  median.  The  median,  a  resistant  measure  against  outliers, 
shows  an  increase  as  experience  level  decreases.  This  is  not  significant  however  because 
the means and errors used in the F-test are not resistant to outliers. Using the median
values  solidifies  the  proposition  that  gender  impacts  SRS  performance. 
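
The difference between a resistant and a non-resistant measure can be shown with a small numeric sketch. The values below are hypothetical error rates chosen for illustration, not data from this experiment, and the helper name is an assumption.

```python
from statistics import mean, median


def summarize(error_rates):
    """Return the (mean, median) of a list of error rates."""
    return mean(error_rates), median(error_rates)


# Hypothetical trials: six typical results, then the same six plus two outliers.
typical_trials = [0.02, 0.03, 0.03, 0.03, 0.03, 0.04]
with_outliers = typical_trials + [0.15, 0.18]

typical_mean, typical_median = summarize(typical_trials)
outlier_mean, outlier_median = summarize(with_outliers)
```

In this sketch the two extreme trials pull the mean upward while the median stays where it was, which is the behavior described for the experienced male category.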

Without taking gender into account, however, there is no significant difference
between test subjects based solely on experience level. There may appear to be a
difference in Figure 4, but using the mean values, it is not significant. This confirms our
expectations and demonstrates that SRS performance has little to do with the experience
level of the conning officer.

3.  Expectation  #3 

Expectation number three suggested that additional SRS training lowers error
rates. Figure 5 depicts a side-by-side box plot and shows a significant relationship
between error rates and training, supporting our expectation. The full details of the
additional SRS training were presented in Chapter III.


Figure  5.  Training  vs.  No  Training  Error  Rates 


According  to  this  analysis,  with  a  P- value  of  .05,  SRS  capable  of  individual  user 
training  will  produce  fewer  errors.  This  is  an  important  design  characteristic  to  consider 
for  future  SRS  implementation.  Current  Navy  training  simulators  with  SRS  technology 
do  not  use  individual  user  training.  [Ref.  52]  It  would  seem  however  that  any  SRS 
system  designed  for  ship  control  purposes  should  incorporate  a  user  training  feature. 

Another noticeable difference between the two sets of data is the spread of the
results. The small white boxes that surround the median error rate represent the
interquartile range in which fifty percent of the data falls. Notice that the “training”
white box is the smaller of the two and that there is no overlap in the area covered by the
boxes. The “no training” group consists of only nine trials while the “training” group had
21 trials. The data are much more tightly grouped in the “training” trials, despite the
larger number of trials. This indicates that additional training with SRS eliminates some
of the variation and produces a more accurate and well-defined result.
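
The box in a box plot spans the first to the third quartile. The sketch below computes that span for two hypothetical groups; the error rates are illustrative values, not the study data, and the quartile rule used by Python's `statistics.quantiles` (its default "exclusive" method) may differ slightly from the rule a plotting package applies.

```python
from statistics import quantiles


def interquartile_range(values):
    """Return (first quartile, third quartile, interquartile range)."""
    q1, _median, q3 = quantiles(values, n=4)
    return q1, q3, q3 - q1


# Hypothetical error rates for a tightly grouped and a widely spread set of trials.
tight_group = [0.02, 0.03, 0.03, 0.04, 0.04, 0.05, 0.05]
spread_group = [0.03, 0.06, 0.08, 0.10, 0.12, 0.15, 0.20]

tight_iqr = interquartile_range(tight_group)[2]
spread_iqr = interquartile_range(spread_group)[2]
```

A smaller interquartile range corresponds to a shorter box, as described for the “training” group.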

As  encouraging  as  these  results  are,  it  must  be  noted  that  the  method  of  training 
used in this study is not thorough enough. The training is limited to only two words:
“starboard”  and  “rudder”.  Test  subjects  repeated  each  of  these  words  several  times  until 
the  software  established  firm  models  for  each  word.  These  words  were  selected  due  to 
their  high  rate  of  error  observed  in  pre-trial  exercises.  However,  to  truly  assess  the  value 
of  system  training,  the  training  should  include  most  if  not  all  of  the  words  and  phrases  in 
the  restricted  conning  vocabulary.  A  more  comprehensive  training  method  may  reduce 
error  rates  even  further. 
V. CONCLUSION


A.  SUMMARY 

The  U.S.  Navy  of  today  emphasizes  cost  cutting  through  reductions  in  manpower 
numbers. One of the key components enabling this reduction of the workforce is the
substitution of technology for watchstanders. This SRS study shows the practicality of
using commercial-off-the-shelf speech recognition software for ship control purposes,
thereby demonstrating the prospect of eliminating bridge watchstanders. SRS is already
used  in  the  military  training  environment  and  could  be  adapted  for  operational  use  as 
well.  Despite  the  fact  that  the  technical  feasibility  of  SRS  implementation  is  very  high,  a 
number  of  questionable  psychological  barriers  remain  and  may  only  be  overcome 
through  the  proven  reliable  usage  of  SRS  over  time.  Previous  experimentation  with  SRS 
led  to  the  identification  of  several  areas  that  required  follow-on  research.  This  study, 
through  the  use  of  a  controlled  experiment  using  COTS  SRS  addresses  many  of  those 
outstanding  areas.  The  results  of  this  study  show  that: 

• The experience level of a conning officer has no significant impact on SRS
performance; however, in this experiment, a limited number of trials
indicates that gender may make a difference. Female participants
experienced more SRS errors than did their male counterparts.

•  SRS  with  restricted  vocabulary  performs  no  better  than  SRS  with  large 
vocabularies. 

•  Following  the  user  profile  establishment,  individual  user  training  on  two 
specific  words  reduces  error  rates  significantly. 

B.  LIMITATIONS  OF  STUDY 

Some  of  the  limitations  of  this  study  are  found  by  examining  the  testing 
environment.  While  the  use  of  the  MSI  simulator  facility  in  Newport  Rhode  Island  was 
conducive  to  experimentation,  the  simulator  does  not  capture  all  of  the  nuances  of  an 
actual  shipboard  environment.  Background  noises  such  as  additional  watch  stander 
conversations,  ships  and  tug  boats  whistles,  wind  noise  were  not  examined  and  may 
impact  the  performance  of  a  COTS  SRS  product.  Additionally,  the  use  of  a  wireless 
microphone  in  the  simulator  was  made  possible  due  to  the  lack  of  competing  signals. 

Onboard  ship,  the  radio  frequency  environment  could  cause  signal  conflicts. 

A second limitation is the small number of test subjects. A larger pool of test
subjects would increase the statistical power of the results. A major shortcoming was that
only two of the test subjects were female.


C.  PROPOSED  FOLLOW-ON  RESEARCH 

Due  to  these  limitations  as  well  as  new  insight  gained  during  this  study,  there  is  a 
need  for  additional  research  in  the  area  of  COTS  SRS  used  for  ship  control  purposes.  The 
paragraphs  below  propose  several  areas  for  follow-on  research. 

•  Use  a  large  pool  of  both  high  and  low  experience  female  test  subjects  to 
determine  the  impact  that  gender  has  upon  SRS  performance. 

•  Conduct  tests  underway  aboard  actual  naval  vessels  to  determine  the 
impact  of  shipboard  background  noise  on  SRS. 

• Compare high-stress and low-stress conditions in experiments to gain insight into the role that
excitement and changes in voice pitch play in SRS accuracy.

•  Consider  SRS  processing  speed  as  a  measure  of  performance  and  research 
the  processing  time  of  a  large  vocabulary  SRS  versus  the  processing  time 
of  a  restricted  vocabulary  SRS. 

•  Study  SRS  user  training  to  determine  its  benefit,  specifically  analyzing 
whether  training  should  be  conducted  on  all  words  in  a  restricted 
vocabulary  SRS. 

Based on the results of this study, and with further testing and development, COTS
SRS is a viable alternative for reducing shipboard manning if it incorporates individual user
training, redundancies, and safeguards as discussed in Chapters II and III. Its initial use
could be limited to open-ocean transits until the Navy gains confidence in eliminating the
helmsman and lee helmsman watchstanders. Shore-based SRS ship handling simulators
like the ones currently in use continue to expose new “ship drivers” to the
intricacies of SRS use and to train them in it. These measures can help to ensure a smooth transition to
SRS-based ship control. The Chief of Naval Operations' guidance and the Naval
Transformation Roadmap both endorse inserting technology to develop manpower-saving
capabilities. Speech Recognition Software is precisely the type of technology that can
fulfill this requirement.

APPENDIX A. STANDARD COMMANDS


Standard  commands  will  vary  depending  on  the  type  of  ship.  Listed  below  is  the  format  for  the 
most  common  standard  commands  used  by  naval  surface  vessels. 

Engine orders:

WHICH ENGINE¹, DIRECTION², AMOUNT³
WHICH ENGINE¹, stop

¹ Starboard engine / Port engine / All engines
² Ahead / Back
³ 1/3 / 2/3 / standard / full / flank / or by pitch (e.g. “20% pitch”)

Steering orders:

DIRECTION¹, AMOUNT², steady on COURSE³ ⁴ - (Used for course changes greater than 10 degrees)
Come DIRECTION¹, steer course COURSE³ - (Used for course changes less than 10 degrees)
Hard DIRECTION¹ rudder, steady on COURSE³ - (Used for extremis steering)

¹ Right / Left
² Standard rudder / full rudder / or number of degrees (e.g. “10 degrees rudder”)
³ Any heading between 000 and 359.
⁴ A steady on course is optional.

Additional standard commands used for steering:

Rudder amidships
Steady as she goes
Meet her
Mind your helm
Shift your rudder
EASE or INCREASE your rudder to DIRECTION¹, AMOUNT²

¹ Right / Left
² Standard rudder / full rudder / or number of degrees (e.g. “10 degrees rudder”)
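
The command formats above amount to a small, fixed grammar, which is part of what makes a restricted vocabulary plausible for ship control. The following is a minimal sketch, not the software used in the experiment, of how recognized text could be checked against that grammar. The function names (`normalize`, `parse_order`), the field names, and the choice to accept “steady on” with or without the word “course” are illustrative assumptions; only the engine and basic steering formats listed above are covered.

```python
import re

# Building blocks taken from the command formats in Appendix A.
ENGINE = r"(?P<engine>starboard engine|port engine|all engines)"
DIRECTION = r"(?P<direction>ahead|back)"
AMOUNT = r"(?P<amount>one third|two thirds|standard|full|flank|\d{1,3} percent pitch)"
SIDE = r"(?P<side>right|left)"
RUDDER = r"(?P<rudder>standard rudder|full rudder|\d{1,2} degrees rudder)"
COURSE = r"(?P<course>\d{3})"
STEADY = rf"(?: steady on(?: course)? {COURSE})?"  # "steady on" is optional

# (kind, pattern) pairs; the kind labels are hypothetical names.
PATTERNS = [
    ("engine", re.compile(rf"{ENGINE} {DIRECTION} {AMOUNT}")),
    ("engine_stop", re.compile(rf"{ENGINE} stop")),
    ("rudder", re.compile(rf"{SIDE} {RUDDER}{STEADY}")),
    ("come", re.compile(rf"come {SIDE} steer course {COURSE}")),
    ("hard", re.compile(rf"hard {SIDE} rudder{STEADY}")),
]


def normalize(text):
    """Lower-case the text, spell out fractions and '%', drop punctuation."""
    text = text.lower().replace("%", " percent")
    text = text.replace("1/3", "one third").replace("2/3", "two thirds")
    text = re.sub(r"[,.]", " ", text)
    return " ".join(text.split())


def parse_order(text):
    """Return a dict describing the order, or None if it is not recognized."""
    normalized = normalize(text)
    for kind, pattern in PATTERNS:
        match = pattern.fullmatch(normalized)
        if match is None:
            continue
        fields = {k: v for k, v in match.groupdict().items() if v is not None}
        # Headings are only valid between 000 and 359.
        if "course" in fields and int(fields["course"]) > 359:
            return None
        return {"kind": kind, **fields}
    return None


if __name__ == "__main__":
    print(parse_order("All engines ahead 2/3"))
    print(parse_order("Right standard rudder, steady on course 090"))
    print(parse_order("Come left, steer course 365"))  # invalid heading
```

A parser of this kind could serve as one of the safeguards discussed earlier: text that does not match any standard command format is rejected rather than passed on as an order.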

APPENDIX B. MSI/NPS TEST DOCUMENT


MARINESAFETY INTERNATIONAL INTER-OFFICE MEMORANDUM
10/24/03

To: Distribution

From: Fred Bronaugh, CAPT USN (Ret.)

Subject: US Naval Post Graduate School Voice Recognition Experiment (change 2)

1. From Monday the 27th of October to Wednesday the 29th we will be hosting a Voice
Recognition experiment for the NPGS. Lt Rob Kuffel will be test director and will be using very
experienced mariners (us) and experienced mariners (SWOS instructors) in the experiments.

2. The BWS will be the experiment site and three different docking scenarios in NCR
will be used. Expect the three will be (A) moor and U/W from 2S, (B) moor and U/W from 3P
and (C) moor and U/W from 7S. All runs will be no current, no wind and will start about two ship
lengths from the pier. The CG-47 class will be the own ship.


3. Schedule of events and tasking:

Monday 27 OCT                   Test Subject            Sequence
0800-0900: Set-up
0900-1100: TEST SUBJECT A       LT Reichenau            [A,B,C]
1100-1200: Lunch
1200-1400: TEST SUBJECT B       LT Mullins              [B,A,C]
1400-1600: TEST SUBJECT C       Dan Liuzzi              [C,B,A]
1600-1700: Flex time

Tuesday 28 OCT
0800-1000: TEST SUBJECT D       Bud Weeks               [C,A,B]
1000-1200: TEST SUBJECT E       Dave Kane               [A,B,C]
1200-1300: Lunch
1300-1500: TEST SUBJECT F       Ed Lynch                [B,A,C]
1500-1700: TEST SUBJECT G       LT Rickwait             [C,B,A]

Wednesday 29 OCT
0800-1000: TEST SUBJECT H       Fred Bronaugh           [A,C,B]
1000-1200: TEST SUBJECT I       LT Baicirak (female)    [B,C,A]
1200-1300: Lunch
1300-1500: TEST SUBJECT J       LTjg Krug (female)      [A,B,C]
1500-1700: Flex time/Wrap-up

4. The operator will maintain control of rudder and engines; commands will be relayed
by hand/headset. The objective is to evaluate the effectiveness and reliability of the software,
not to evaluate shiphandling skill. Setup will be the responsibility of Lt Kuffel; Calvin, you should
be ready to provide assistance.

Thanks

Fred

Distribution: Ed, Bud, Dan L., Dave, Pete, George K., Tom, Jim and Calvin




APPENDIX C. TEST SUBJECT BRIEF


Thank you for agreeing to participate in this study. The purpose of this study is to
determine the reliability of commercial off-the-shelf speech recognition software when used for
ship control purposes. You will be asked to conn a simulated CG in or out of port while wearing a
wireless microphone. It is important that you remember that your ship driving ability is NOT being
tested. Try to remain calm throughout your scenario and speak in a loud and clear voice. Try to
avoid contact with the microphone and external conversations. If you must say something other
than an engine or rudder order, you may switch off the microphone temporarily. When you turn it
back on, however, be sure to pause before giving an order. Your verbal commands will be
transmitted to a laptop computer that will convert them into text. The entire experiment should
take about 2 hours.

The first step will be to set up a user profile on the computer. [SET UP PROFILE]

The  format  for  standard  commands  that  you  should  use  for  this  experiment  is  as  follows: 

Engine orders:

WHICH ENGINE¹, DIRECTION², AMOUNT³
WHICH ENGINE¹, stop

¹ Starboard engine / Port engine / All engines
² Ahead / Back
³ 1/3 / 2/3 / standard / full / flank / or by pitch (e.g. “20% pitch”)

Steering orders:

DIRECTION¹, AMOUNT², steady on COURSE³ ⁴ - (Used for course changes greater than 10 degrees)
Come DIRECTION¹, steer course COURSE³ - (Used for course changes less than 10 degrees)
Hard DIRECTION¹ rudder, steady on COURSE³ - (Used for extremis steering)

¹ Right / Left
² Standard rudder / full rudder / or number of degrees (e.g. “10 degrees rudder”)
³ Any heading between 000 and 359.
⁴ A steady on course is optional.

All  other  standard  commands  remain  unchanged. 

Rudder  Amidships 

Ease  or  increase  your  rudder  to  ... 

Steady  as  she  goes 
Etc. 

APPENDIX D. SRS SPECIALIZED VOCABULARY


0     28                        Ease                            Port
1     29                        Ease your rudder to left        Port engine ahead
2     30                        Ease your rudder to right       Port engine back
3     31                        Engine                          Right
4     32                        Engines                         Right full rudder
5     33                        Flank                           Right standard rudder
6     34                        For                             Rudder
7     35                        Full                            Shift
8     36                        Goes                            She
9     37                        Hard                            Standard
10    All                       Hard right rudder               Starboard
11    All engines ahead         Hard left rudder                Starboard engine ahead
12    All engines back          Helmsman                        Starboard engine back
13    Amidships                 Increase                        Steady
14    Rudder amidships          Increase your rudder to right   Steady as she goes
15    And                       Increase your rudder to left    Steer
16    As                        Indicate                        Stop
17    Back                      Knots                           To
18    Course                    Left                            Turns
19    Come                      Left full rudder                Two thirds
20    Come right steer course   Left standard rudder            You
21    Come left steer course    One third                       Your
22    Degrees                   Percent
23                              Percent pitch
24                              Pitch
25
26
27

APPENDIX E. TEST DATA SPREAD SHEET


All Error Types

[Columns "trial" and "# errors": values not recoverable]

Summary Data

# orders   P[error]   subj   exp level   gender   scenario   sequence
48         0.083333   I      low         M        A          1
46         0.086957   I      low         M        B          2
54         0.074074   I      low         M        C          3
40         0.125      II     low         M        B          1
48         0.104167   II     low         M        A          2
59         0.101695   II     low         M        C          3
48         0.104167   III    high        M        C          1
32         0.09375    III    high        M        B          2
28         0.107143   III    high        M        A          3
72         0.180556   IV     high        M        C          1
59         0.118644   IV     high        M        A          2
51         0.215686   IV     high        M        B          3
39         0.153846   V      high        M        A          1
33         0.030303   V      high        M        B          2
43         0.093023   V      high        M        C          3
48         0.020833   VI     high        M        B          1
51         0.039216   VI     high        M        A          2
52         0.057692   VI     high        M        C          3
52         0.019231   VII    low         M        C          1
34         0.029412   VII    low         M        B          2
32         0          VII    low         M        A          3
33         0.090909   VIII   high        M        A          1
39         0.282051   VIII   high        M        C          2
31         0.096774   VIII   high        M        B          3
31         0.225806   IX     low         F        B          1
45         0.177778   IX     low         F        C          2
12         0.25       IX     low         F        A          3
61         0.114754   X      low         F        A          1
41         0.02439    X      low         F        B          2
50         0.04       X      low         F        C          3
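
Each P[error] value in the summary data is the number of misrecognized orders divided by the number of orders given in that trial, so the error counts can be recovered by multiplying the two columns. The sketch below is an illustration only, not the analysis performed in this study: it recovers those counts and pools them by gender or experience level. The helper name `pooled_error_rates` is hypothetical, and two things are assumptions rather than source data: the "M" label for subjects I through VIII (the source marks only subjects IX and X as female) and the "high" experience level for the second trial of subject VI, which is blank in the source.

```python
from collections import defaultdict

# (orders, p_error, subject, exp_level, gender), transcribed from Appendix E.
# ASSUMPTION: gender "M" for subjects I-VIII; "high" for subject VI, trial 2.
TRIALS = [
    (48, 0.083333, "I", "low", "M"), (46, 0.086957, "I", "low", "M"),
    (54, 0.074074, "I", "low", "M"), (40, 0.125, "II", "low", "M"),
    (48, 0.104167, "II", "low", "M"), (59, 0.101695, "II", "low", "M"),
    (48, 0.104167, "III", "high", "M"), (32, 0.09375, "III", "high", "M"),
    (28, 0.107143, "III", "high", "M"), (72, 0.180556, "IV", "high", "M"),
    (59, 0.118644, "IV", "high", "M"), (51, 0.215686, "IV", "high", "M"),
    (39, 0.153846, "V", "high", "M"), (33, 0.030303, "V", "high", "M"),
    (43, 0.093023, "V", "high", "M"), (48, 0.020833, "VI", "high", "M"),
    (51, 0.039216, "VI", "high", "M"), (52, 0.057692, "VI", "high", "M"),
    (52, 0.019231, "VII", "low", "M"), (34, 0.029412, "VII", "low", "M"),
    (32, 0.0, "VII", "low", "M"), (33, 0.090909, "VIII", "high", "M"),
    (39, 0.282051, "VIII", "high", "M"), (31, 0.096774, "VIII", "high", "M"),
    (31, 0.225806, "IX", "low", "F"), (45, 0.177778, "IX", "low", "F"),
    (12, 0.25, "IX", "low", "F"), (61, 0.114754, "X", "low", "F"),
    (41, 0.02439, "X", "low", "F"), (50, 0.04, "X", "low", "F"),
]

# Column positions within each TRIALS tuple.
SUBJECT, EXP_LEVEL, GENDER = 2, 3, 4


def pooled_error_rates(trials, key_index):
    """Group trials by one column and pool errors over orders per group.

    Returns {group: (errors, orders, errors / orders)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for row in trials:
        orders, p_error = row[0], row[1]
        errors = round(orders * p_error)  # recover the integer error count
        totals[row[key_index]][0] += errors
        totals[row[key_index]][1] += orders
    return {
        group: (errors, orders, errors / orders)
        for group, (errors, orders) in totals.items()
    }


if __name__ == "__main__":
    for label, index in (("gender", GENDER), ("exp level", EXP_LEVEL)):
        print(label)
        for group, (errors, orders, rate) in pooled_error_rates(TRIALS, index).items():
            print(f"  {group}: {errors}/{orders} = {rate:.3f}")
```

Pooling weights each trial by the number of orders given, so a short trial such as the 12-order run for subject IX counts for less than it would in a simple average of the P[error] column.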